Diagnostic method

ABSTRACT

PCT No. PCT/GB93/01520 Sec. 371 Date Apr. 7, 1995 Sec. 102(e) Date Apr. 7, 1995 PCT Filed Jul. 20, 1993 PCT Pub. No. WO94/02633 PCT Pub. Date Feb. 3, 1994

There is marked over-expression of multiple spliced variants of the CD44 gene in tumor compared to counterpart normal tissue. This observation forms the basis of a method of diagnosing neoplasia by analysis of a sample of body tissue or body fluid or waste product. A new exon 6 of 129 bp has been located and sequenced, and is claimed as such and for use in a diagnostic method.

BACKGROUND

The present invention is concerned with using expression of the CD44 gene or part of the CD44 gene to investigate neoplasia. Such investigation includes taking a tissue, body fluid or other sample from a patient to perform diagnosis, to give a prognosis or to evaluate therapy that is already being carried out. In particular, the invention provides a simple method for carrying out routine screening for neoplasia using body fluid samples or other samples which can be obtained non-invasively.

The usual way to diagnose a tumour at present is by looking at cells or thin slices of tissue down a microscope, a method which is often very effective but has some important limitations. With a small sample, diagnosis can be very difficult and often a large number of cells will not be available, or it is not desirable or possible to obtain a large sample from the patient. In as many as 50% of cases a reliable diagnosis cannot be given; it may be that there is no positive evidence of carcinoma but also no certainty that the patient is actually free from carcinoma. More invasive investigation is then required to establish a diagnosis.

Judgment of prognosis also relies on the appearance of cells when viewed under a microscope. Generally, the more bizarre-looking the cells in a primary tumour, the more likely they are to metastasise later on but the correlation is by no means absolute. It would clearly be an advantage to be able to predict more accurately whether or not metastasis is likely to occur in order to judge what will be the most effective treatment.

The human CD44 gene codes for a family of variably glycosylated cell surface proteins of different sizes, the numerous functions of which are not yet fully established, but which share epitopes recognised by the CD44 monoclonal antibody (mAb). It is known to consist of a standard portion which is expressed in haemopoietic cells and many other cell types and into which the products of additional exons may be spliced in various combinations to produce different proteins. This is a well recognised mechanism in eukaryotes for producing several often functionally unrelated proteins from the same gene, and is known as alternative splicing.

Two common CD44 isoforms have so far been purified and characterised (Stamenkovic et al. 1989), namely i) a 90 kD form consisting of a central 37 kD core which is heavily glycosylated and ii) a 180 kD form which has 135 extra amino acids inserted into the proximal extra-membrane domain and is even more heavily glycosylated. Immuno-cytochemical and immuno-precipitation studies have shown that both are widely distributed in many different cells and tissues. The former is known as the haemopoietic or standard form which is present on circulating leukocytes, bone marrow cells and numerous other cell types. The other, known as the epithelial variant, is detectable on several epithelial cell types. Both are believed to function as receptors mediating homotypic and heterotypic adhesive interactions, attaching cells to each other or to adjacent extracellular scaffolding.

Some time ago, some of the CD44 epitopes recognised by the mAb Hermes-3 were identified as constituting the peripheral lymph node receptor enabling circulating lymphocytes to recognise and traffic through peripheral lymph nodes. Further mAbs to this antigen later became available and Stamenkovic et al. (1989) used one of them to clone a cDNA sequence coding for the standard form of the molecule from an expression library in COS cells. They additionally found, by Northern blotting, that this gene was expressed not only by lymphoid cells, but also by a variety of carcinoma cell lines and a representative sample of solid carcinomas, amongst which two colonic carcinomas appeared to express more than normal colonic epithelium.

Birch and colleagues (1991) reported that melanoma cell clones which strongly expressed the 80-90 kD form of the CD44 antigen, recognised by the Hermes-3 antibody, were substantially more metastatic in nude mice than clones which expressed it weakly. Sy et al. (1991) described a moderate increase in metastatic capability of human lymphoma cells in nude mice, after the cells were transfected with the standard CD44 gene, but not after transfection with a construct coding for the epithelial variant. Gunthert et al. (1991) obtained results indicating that a variant form of the lymphocyte homing receptor, recognised by a new antibody raised to the rat CD44 antigen, is required for metastatic behaviour of rat pancreatic adenocarcinoma cells. Using this antibody they cloned a cDNA sequence corresponding to the variant form of CD44 and found that it contained previously unidentified exons. Transfection of a non-metastatic clone from the same cell line with a construct designed to over-express this cDNA sequence unique to the metastatic counterpart appeared to induce metastatic behaviour (Gunthert et al, 1991).

In view of these findings it became of interest to know whether other cultured metastatic and non-metastatic human tumour cell lines, of various histogenetic origins, expressed CD44 products differentially. The expression of genes in cells or tissues can be studied most efficiently and sensitively by making cDNA from cellular messenger RNA and amplifying regions of interest with the polymerase chain reaction, using specific oligonucleotide primers chosen to anneal preferentially to portions of the cDNA corresponding to the gene products. However, subsequent work by Hofmann et al. (1991) and the present applicants using this approach provided results which showed that CD44 expression did not regularly and reliably correlate with the metastatic capability or even tumour forming ability of these cultured cell lines in nude mice. At about this time, three separate groups (Hofmann et al, 1991, Stamenkovic et al, 1991 and Jackson et al, 1992) published sequence data on further splice variants they had found being expressed by this gene in various human cell lines.

THE INVENTION

The present invention arises from a surprising discovery made in studies examining the expression of various parts of the CD44 gene in fresh tissue and body fluid samples from patients with tumours of the breast and colon and from their metastases. The results indicate sharp and clear differences in CD44 expression between tissues from i) metastatic (malignant) tumours, ii) non-metastatic locally invasive tumours and benign tumours and iii) normal tissue. The distinction between groups i) and ii) is important for judgment of therapy and that between groups ii) and iii) is important for early diagnosis and screening.

The invention therefore provides in one aspect a method of diagnosis of neoplasia, which method comprises analysing the expression of the CD44 gene in a sample.

In a particular embodiment, the invention provides a method of assaying a sample for products of the CD44 gene or part thereof which method comprises making cDNA from messenger RNA (mRNA) in the sample, amplifying portions of the complementary DNA (cDNA) corresponding to the CD44 gene or part thereof and detecting the amplified cDNA, characterised in that the amplified cDNA is used in diagnosis of neoplasia.

The diagnosis of neoplasia may refer to the initial detection of neoplastic tissue or it may be the step of distinguishing between metastatic and non-metastatic tumours. References to the term "diagnosis" as used herein are to be understood accordingly.

The method is particularly applicable to the diagnosis of solid tumours, particularly malignant tumours e.g. carcinomas. The sample on which the assay is performed is preferably of body tissue or body fluid, and not of cells cultured in vitro. The sample may be a small piece of tissue or a fine needle aspirate (FNA) of cells from a solid tumour. Alternatively, it may be a sample of blood or urine or another body fluid, a cervical scraping or a non-invasively obtained sample such as sputum, urine or stool.

The cDNA may be detected by use of one or more labelled specific oligonucleotide probes, the probes being chosen so as to be capable of annealing to part of the amplified cDNA sequence. Alternatively, labelled oligonucleotide primers and/or labelled mononucleotides could be used. There are a number of suitable detectable labels which can be employed, including radiolabels.

BRIEF DESCRIPTION OF THE DRAWINGS

Reference is directed to the accompanying drawings, in which:

FIGS. 1A, 1B, 1C, 2A, 2B, 2C, 3, 4, 5A, and 5B are autoradiographs showing the results of various experiments reported below,

FIG. 6 is a map of the CD44 gene showing exons, probes and primers. The numbering of the exons corresponds to that used by G. R. Screaton et al. (1992),

FIG. 7 is the nucleic acid sequence of Exon 6 (shown in FIG. 6) (SEQ ID NO:1), the corresponding amino acid sequence being also shown (SEQ ID NO:2), and

FIG. 8 is a set of autoradiographs showing the results of another experiment.

FIG. 6 is a map of the CD44 gene showing exons 6 to 14. The basic or standard protein can theoretically be modified by the insertion of transcripts from any, some, or all of these 9 extra exons. Exon 6 was unknown at the priority date of this patent application, and constitutes a further aspect of the invention. Exon 6 is over-expressed in tumours but not in normal tissues, and is located in the vicinity of exons 7 to 9. The sequence of exon 6 is given in FIG. 7. It contains 129 base pairs and is flanked on the 5'-side by the standard CD44 sequence, and on the 3'-side usually by exon 7.

In contrast to Exons 9 to 11, the products of Exon 6 (the newly-sequenced Exon) are only barely detectable in samples of normal tissues. This suggests that Exon 6 will be of particular value in the diagnosis of neoplasia.

In another aspect, this invention provides as new compounds, the nucleic acid sequence of Exon 6 as shown in FIG. 7, characteristic fragments thereof, sequences which are degenerated and/or represent allele variations, the homologous nucleic acid sequences, and probes, primers and other reagents capable of hybridising with the sequences or homologues. These compounds and reagents will all be useful in the method described above.

Further aspects of this invention include:

The peptide sequence, corresponding to Exon 6 and shown in FIG. 7, its allele variations and secondary modifications thereof, like phosphorylation and glycosylation products, and characteristic fragments thereof, for example those fragments which constitute epitopes when the polypeptide is folded in vivo.

Antibodies to the peptide sequence, its allele variations and secondary modifications thereof, like phosphorylation and glycosylation products and the characteristic fragments thereof. Such antibodies may be labelled, for example with a radionuclide or with a tumouricidal compound.

Use of a labelled antibody for in vitro diagnosis.

Use of such a radiolabelled antibody for radioimaging or in vivo diagnosis.

Use of the antibodies, optionally labelled with tumouricidal compounds or otherwise, in therapy.

The peptide sequence or fragment can be synthesised by standard techniques, e.g. using an automatic synthesiser. The antibodies can be made by administering the peptide in antigenic form to a suitable host. Polyclonal or monoclonal antibodies may be prepared by standard techniques.

In a further aspect, the invention provides a method for the immunological diagnosis of neoplasia, characterised by determining over-expression of an exon located in the vicinity of exons 7 to 9 of the CD44 gene. Preferably the exon has the nucleic acid sequence shown in FIG. 7.

The chaotic over-expression of multiple spliced variants of the CD44 gene in tumours implies that a particular exon may or may not be over-expressed (or expressed at all) by a particular tissue sample. An immunoassay using an antibody to the peptide expressed by any single exon may therefore give misleading results. This invention therefore includes use, for the immunological diagnosis of neoplasia, of a mixture of antibodies to two or more, and preferably to all nine, of the CD44 exons.

DETAILED DESCRIPTION

In one embodiment of the invention, the amplification of cDNA is carried out using the polymerase chain reaction (PCR). For PCR, primers may be chosen using known sequence information for human CD44 cDNA. Primers may be used that amplify cDNA corresponding to any part of the CD44 gene that may be expressed. This may include the standard portion with or without the inserted exons, or it may be part or all of one or more of the exons only. In the latter case there would be less wastage of reagents and a better signal produced, and a probe for the standard sequence would not be used.

The invention is not limited to the use of straightforward PCR. A system of nested primers may be used, for example. Other suitable amplification methods known in the field can also be applied.

In another method according to the invention, the amplified cDNA is separated by electrophoresis. Blotting and autoradiography may then be performed on the separated cDNA. Autoradiography involves probing electrophoresed amplified products, immobilised by blotting them on to a nylon membrane, with a specific oligonucleotide probe labelled with ³²P or other suitable label, the probe being chosen so as to be capable of annealing to part of the amplified cDNA sequence. The detection step then involves exposure of the labelled, separated cDNA to X-ray film.

In the examples which follow it was found that expression of the human CD44 gene was consistently and distinctively increased in various solid tumours relative to normal tissues. Malignant (i.e. already metastatic) tumours differed from locally invasive and benign ones in the pattern and magnitude of changes seen. The study was performed on samples from 46 tumours of which 44 were locally invasive or metastatic and 2 were benign. Analysis of CD44 expression was performed by using PCR to amplify cDNA made by reverse transcription of RNA extracted from fresh surgical biopsy samples. By choosing oligonucleotide primers which specifically anneal to certain portions of the CD44 gene, it is possible to amplify portions of the gene which, from these results, are of diagnostic and prognostic interest.

The strong association found here, between altered CD44 expression and neoplasia, need not imply that any of the individual exons of the gene are expressed only in neoplasia or in progression to metastatic malignancy. Evidence accrued in many laboratories in recent years (see Knudson 1985, Tarin 1992, Hayle et al 1992 for reviews) indicates that these pathological processes are probably the consequences of disturbed regulation of genes coding for normal cellular activities such as cell proliferation and migration. Therefore it seems unlikely that any gene, or portion of a gene, has the sole function of programming neoplasia or metastasis.

The finding in the present study of transcripts from exon 10/11 in normal tissues indicates that this exon is not exclusively concerned with metastatic activity, even though there is a marked increase in the number and signal intensity of bands hybridising with radiolabelled probe E4 in the PCR products from tumours capable of metastasis. Other supporting events are therefore believed to be required for CD44 exon 10/11 expression to result in metastatic behaviour. Nevertheless, the observation that transcripts from this exon were over-expressed in samples from metastatic tumours promises to be a very useful indicator of prognosis.

It is not expected that further research will find that the natural (non-mutated) products of any individual exon will be uniquely present in tumour cells and not in normal counterparts. Instead, it is likely that an abnormal pattern of gene activity consisting of over-expression and inappropriate combination of products of a gene, such as that reported here for the CD44 locus, could play a part in malignancy. These changes may themselves be required for malignant conversion, or be the consequence of other genetic disturbances causing such a conversion. Even so, without resolving this issue, an observer using these techniques can obtain information relevant to assigning a sample to neoplastic or non-neoplastic categories.

EXAMPLES

Method

Fresh tissue samples, 0.5-1 cm diameter, were obtained from surgical resection specimens removed at therapy of 34 patients with breast tumours and colon tumours. The samples were snap-frozen in liquid nitrogen within ten minutes of arrival in the pathological specimen reception area and kept in liquid nitrogen until use. Portions of lymph node metastases and blood-borne metastases were also collected, if present, in the tissue resected for diagnosis. Normal breast tissue, normal colon mucosa, normal lymph node adjacent to the tumour in the breast and normal liver were also collected from the surgically resected samples and from other samples removed for non-neoplastic conditions. Normal peripheral blood leukocytes were obtained from 10 volunteers and bone marrow from 3 volunteers. The histological features of the tumours and their clinical stages were as described in Table 1.

Total cellular RNA extraction from tissue samples was performed according to the method described by Chomczynski and Sacchi (1987). Extraction from fluid samples was by use of the Microfasttrack kit marketed by Invitrogen. cDNA synthesis and subsequent amplification by the polymerase chain reaction (PCR) was performed using the Superscript™ preamplification system (BRL Life Technologies Inc., Middlesex, UK) with buffers and reagents supplied in this kit. In brief, this involves an initial step of first strand cDNA synthesis with reverse transcriptase, using sample RNA as the template and supplied nucleotide triphosphates. For subsequent PCR each sample was overlaid with oil and heated at 94° C. for 5 minutes to denature nucleic acid; 30 cycles of PCR were then conducted with the following cycle parameters: 94° C. for 1 min, 55° C. for 1 min, 72° C. for 2 min. Negative controls, in which there was no template cDNA in the reaction mix, were routinely run with each batch. The primer and probe sequences we devised, using information from the published sequence for human CD44 cDNA (Hofmann et al, 1991, Stamenkovic et al, 1991, Jackson et al, 1992) (FIG. 6), were as follows:
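As a rough planning aid, the thermal profile stated above can be tallied as follows. This is a minimal sketch and not part of the described protocol: it counts only the programmed hold times given in the text and ignores ramping between temperatures, and the names used are illustrative.

```python
# Programmed hold times taken from the protocol text (minutes).
INITIAL_DENATURATION_MIN = 5  # 94 C hold before cycling
CYCLE_STEPS_MIN = {"denature_94C": 1, "anneal_55C": 1, "extend_72C": 2}
N_CYCLES = 30


def total_hold_minutes(initial=INITIAL_DENATURATION_MIN,
                       steps=CYCLE_STEPS_MIN,
                       cycles=N_CYCLES):
    """Sum of programmed hold times; ramp times are not included."""
    return initial + cycles * sum(steps.values())


print(total_hold_minutes())  # 125
```

On these figures the programmed holds come to 125 minutes per run; the actual run time on a given thermal cycler would be longer.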

    P1=5'GACACATATTGCTTCAATGCTTCAGC (SEQ ID NO:3)

    P4=5'GATGCCAAGATGATCAGCCATTCTGGAAT (SEQ ID NO:4)

P1 is located with its origin 324 bp upstream from the insertion site in the standard CD44 molecule (between nucleotides 782 and 783 in the sequence published by Stamenkovic et al, 1989) and P4 is 158 bp downstream of this site. These primers produce a PCR fragment of 482 bp if a sample expresses standard CD44 (so-called haemopoietic CD44), 878 bp for the epithelial form of CD44 and several other bands, if a sample contains alternatively spliced transcripts. 10 μl of each PCR product was electrophoresed in a 1.2% agarose gel and transferred to Hybond N+ (Amersham UK, Little Chalfont, UK) nylon membranes for hybridisation with oligonucleotide probe E4 (=5'TGAGATTGGGTTGAAGAAATC-3') (SEQ ID NO:5), see FIG. 6. Blotting and autoradiography were performed to improve sensitivity of detection and resolution. The probe was radiolabelled with γ-³²P-ATP in the presence of polynucleotide kinase. After prehybridisation, hybridisation was performed in 10% dextran, 6×NET, 5×Denhardt solution, 0.5% NP40 and 100 μg/ml salmon sperm DNA at 42° C. overnight. The filter was then washed twice in 2×SSC, 1×SSC and 0.5×SSC with 0.1% SDS at 42° C. sequentially for 15 minutes each. Filters were exposed to Kodak X-ray film for 2-16 hours. After this, the filters were boiled in 0.5% SDS to strip the probe and rehybridised with another radiolabelled probe, namely P2 (=5'CCTGAAGAAGATTGTACATCAGTCACAGAC) (SEQ ID NO:6), which we designed to anneal to the standard portion of CD44 (FIG. 6). The conditions used for hybridisation, washing and autoradiography were the same as above.
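The fragment sizes quoted for the P1/P4 primer pair follow from simple addition of the flanking distances and the length of any inserted exon sequence. The sketch below illustrates that arithmetic only. The function names are hypothetical, and the last line assumes, purely for illustration, a transcript whose only insert is the 129 bp exon 6; the text does not state that such a transcript was observed.

```python
# Distances from the text: P1 starts 324 bp upstream of the insertion
# site and P4 lies 158 bp downstream of it.
P1_UPSTREAM_BP = 324
P4_DOWNSTREAM_BP = 158
STANDARD_PRODUCT_BP = P1_UPSTREAM_BP + P4_DOWNSTREAM_BP  # 482 bp


def product_size(insert_bp=0):
    """Expected P1/P4 product length for a given total insert length."""
    return STANDARD_PRODUCT_BP + insert_bp


def insert_size(product_bp):
    """Insert length implied by an observed P1/P4 product length."""
    return product_bp - STANDARD_PRODUCT_BP


print(product_size())     # 482, standard (haemopoietic) form
print(insert_size(878))   # 396, insert implied by the epithelial form
print(product_size(129))  # 611, hypothetical exon-6-only insert
```

Read in reverse, `insert_size` gives a quick estimate of how much spliced-in sequence a band of a given size represents.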

Calibration of the sensitivity of the method, for detection of small numbers of cells, was performed as follows: total peripheral blood leukocytes (PBL) were purified from 20 ml whole blood by lysis of packed red blood cells by addition of ammonium chloride buffer (1 ml packed cells to 50 ml lysis buffer) and subsequent centrifugation 15 minutes later. The white cell pellet was divided into 4 tubes which were seeded respectively with 0 μl, 1 μl, 10 μl and 100 μl of a suspension of HT29 colon carcinoma cells (5000 cells per ml). Total RNA was then extracted and each tube yielded approximately 20 μg. cDNA synthesis was performed, as described above, on 4 μg aliquots of the RNA obtained from each tube, representing 0, 1, 10 and 100 tumour cells per aliquotted sample respectively. The PCR was performed on these samples and on positive (tumour cells only) and negative (no DNA) controls using primers D1 and D5 which were designed by us to anneal specifically to exons 7 and 14 in FIG. 6. We know from previous studies that HT29 cells express both exons, and others, in a pattern easily distinguishable from PBL and chose the oligonucleotide primers D1 and D5 because we wished to increase sensitivity by shortening the segment to be amplified. It was also reasoned that use of these primers would circumvent the problem of using primers P1 and P4 for this specific purpose because the majority of these would be soaked up by annealing to the standard portion of the gene. PCR cycle parameters, blotting, probing and washing conditions were as described above. The oligonucleotide sequence used for probing was ³²P-labelled E4.
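The tumour cell numbers per aliquot quoted in the calibration can be reproduced from the stated suspension density, seeding volumes and RNA quantities. The following is a minimal sketch under the assumption that tumour-cell RNA is distributed evenly through the extracted RNA and that each tube yields the approximate 20 μg stated; the function names are illustrative.

```python
HT29_CELLS_PER_ML = 5000          # density of the seeding suspension
SEED_VOLUMES_UL = [0, 1, 10, 100]  # volume added to each of the 4 tubes
RNA_YIELD_UG = 20                 # approximate total RNA per tube
ALIQUOT_UG = 4                    # RNA taken forward to cDNA synthesis


def cells_seeded(volume_ul, cells_per_ml=HT29_CELLS_PER_ML):
    """Tumour cells added to a tube for a given seeding volume."""
    return cells_per_ml * volume_ul / 1000


def cells_per_aliquot(volume_ul):
    """Tumour-cell equivalents represented in one 4 ug RNA aliquot."""
    return cells_seeded(volume_ul) * ALIQUOT_UG / RNA_YIELD_UG


for v in SEED_VOLUMES_UL:
    print(v, cells_seeded(v), cells_per_aliquot(v))
```

Each aliquot is one fifth of the tube's RNA, so seeding 5, 50 and 500 cells corresponds to 1, 10 and 100 tumour cells per aliquot, as stated in the text.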

General Overview of Results

As the primers (P1 and P4) amplify across the splice product insertion site it is clear that the intervening part of the standard molecule will be amplified, in addition to any alternatively spliced variants which contain transcripts from the additional exon domains. Hence the total number of products which could conceivably be detected with a probe (e.g. P2) to the standard form, considering all possible combinations of the sequences identified from this locus, is large. Using probe E4, 16 of these combinations, namely those containing E4 transcripts from exon 11, could potentially be visualised as bands of different molecular sizes resolved by electrophoresis. In practice the full range of possible combinations was not detected in these results, but several (up to 9) alternative splice variants were seen in neoplastic tissues hybridised with each probe. Normal tissues from the breast, colon and lymph nodes did express some E4-containing transcripts (FIGS. 1 and 3), in addition to the standard molecule (FIGS. 2 and 4), but peripheral blood leukocytes (FIG. 5) and liver (FIG. 4) detectably expressed only the latter with this combination of probes and primers. The details are presented below:

EXAMPLE 1 Breast Tissue Samples

The results obtained in the study of breast tissue samples are illustrated in FIGS. 1 and 2. Metastatic tumour deposits and their corresponding primary tumours from all cases over-expressed several alternatively spliced products containing transcripts from exon 11 (FIG. 1A). At least 8 separate bands were frequently seen together with a consistent doublet at 1500 bp and 1650 bp present in all tumours. Normal breast tissue and normal lymph node produced two bands (1150 bp and 860 bp) with this probe. The doublet mentioned above was not seen in any normal sample.

The differences in the number and size of the bands, and in the intensity of signal from the bound probe, between tissues in the normal and malignant categories were obvious in all samples examined. For occasional samples it was necessary to expose the filter to the X-ray film for longer to see the distinctive differences, but this finding was confirmed in every case studied.

Samples from locally invasive tumours with no associated clinical evidence of metastasis and from the two fibroadenomas also over-expressed splice products containing transcripts from exon 10/11 relative to normal tissues, but the extent of this was easily distinguished from the results obtained with malignant tumours and their metastases. Distinction from the patterns seen in normal tissues was also easy (FIG. 1B). However, a single sample gave a similar result to malignant tumours (lane 14) (see below). The two fibroadenomas showed band patterns that were similar to those from non-metastatic carcinomas and the sample from a case of cystic disease of the breast resembled the pattern for normal non-neoplastic breast tissue. This is the first instance of definitive diagnosis by this method. The piece of tissue was provided by the duty pathologist as being from a benign tumour, namely a fibroadenoma, on macroscopic appearance at initial inspection with the naked eye. It was then characterised as definitely non-neoplastic after PCR amplification of its cDNA, and subsequent microscopical examination of the tissue confirmed this.

To confirm that the differences seen with probe E4 are valid and not technical artifacts, the results obtained when the same filter was hybridised with probe P2 are shown in FIG. 2. This shows that i) all tissues examined expressed the standard form of the gene, ii) other exon splice products, not containing transcripts from exon 10/11, were present in tumours and metastases and iii) that the differences described above are not due to unequal loading of tracks in the various panels and lanes on this composite filter, but resulted from alternative splicing. All conditions in this experiment were the same as those in hybridisation with E4, except the exposure time of the filter to X-ray film (10 hours exposure for FIG. 1, versus 1.5 hours for FIG. 3).

EXAMPLE 2 Colon Samples

The findings in colon carcinoma were identical to those in breast carcinoma. Thus, in all cases the colon carcinoma tissues showed an increased number of more intensely labelled, larger molecular weight bands with probe E4 (FIG. 3) than normal colonic mucosa and other normal tissues. As with breast carcinomas, hybridisation with probe P2 showed no differences in the degree of expression of the standard form of the molecule (FIG. 4).

EXAMPLE 3 Calibration of the Sensitivity of the Method

Examination of autoradiograms of PCR products of peripheral blood leukocytes seeded with known numbers of HT29 colon carcinoma cells showed the presence of additional bands characteristic of tumour cells, down to a level of 10 tumour cells in a sample of 10⁷ leukocytes. By fine-tuning the conditions of the assay it is considered possible to detect a single tumour cell in 10 ml of blood.

In the series described above, all samples of neoplastic tissue showed over-expression of alternatively spliced products of the CD44 gene and none of the samples from non-neoplastic tissue did so. Therefore, there was complete correspondence between normal or neoplastic origin of a sample and pattern of CD44 expression. In one instance, a tumour removed from a patient (patient B16, lane 14 in FIG. 1A) with no current clinical evidence of metastasis was found to have a pattern of expression indicating metastatic capability. At present it is not possible to know whether this is a false positive result, or a sign of imminent metastasis. This patient is currently under observation in the follow-up clinic.

EXAMPLE 4

We have designed and synthesised oligonucleotide primers according to our current findings, as follows:

Primer P1=5'-GACACATATTGCTTCAATGCTTCAGC (458-484) (SEQ ID NO:3)

Primer P2=5'-CCTGAAGAAGATTGTACATCAGTCACAGAC (488-518) (SEQ ID NO:6)

Primer P3=5'-TGGATCACCGACAGCACAGAC (746-767) (SEQ ID NO:7)

Primer P4=5'-GATGCCAAGATGATCAGCCATTCTGGAAT (912-941) for standard part (Stamenkovic 1989) (SEQ ID NO:4)

Primer E1=5'-TTGATGAGCACTAGTGCTACAGCA (SEQ ID NO:8)

Primer E2=5'-CATTTGTGTTGTTGTGTGAAGATG (SEQ ID NO:9)

Primer E3=5'-AGCCCAGAGGACAGTTCCTGG (534-554) (SEQ ID NO:10)

Primer E4=5'-TGAGATTGGGTTGAAGAAATC (558-578) (SEQ ID NO:5)

Primer E5=5'-TCCTGCTTGATGACCTCGTCCCAT (585-608) (SEQ ID NO:11)

D1: 5'GAC AGA CAC CTC AGT TTT TCT GGA (63-86) (SEQ ID NO:12)

D5: 5'TTC CTT CGT GTG TGG GTA ATG AGA (888-911) (SEQ ID NO:13) for the exons (Hofmann 1991). E1 and E2 are on exon 6.
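When transcribing oligonucleotide sequences such as those listed above, a simple check of length and base composition can help catch copying errors. The sketch below is offered as an illustration only and was not part of the described work; it uses four of the listed sequences, and the helper names `clean` and `gc_fraction` are hypothetical.

```python
# A subset of the sequences listed above, copied as printed.
PRIMERS = {
    "P1": "GACACATATTGCTTCAATGCTTCAGC",
    "E4": "TGAGATTGGGTTGAAGAAATC",
    "D1": "GAC AGA CAC CTC AGT TTT TCT GGA",
    "D5": "TTC CTT CGT GTG TGG GTA ATG AGA",
}


def clean(seq):
    """Remove the spacing used in the printed triplets and upper-case."""
    return "".join(seq.split()).upper()


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


for name, seq in PRIMERS.items():
    s = clean(seq)
    # Any character outside A, C, G, T would indicate a transcription error.
    assert set(s) <= set("ACGT"), name
    print(f"{name}: {len(s)} nt, GC fraction {gc_fraction(seq):.2f}")
```

For E4, D1 and D5 the computed lengths (21, 24 and 24 nt) agree with the position ranges printed beside those sequences.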

Fresh tissue samples 0.5-1 cm in diameter were obtained from surgical resection specimens or at autopsy. All samples used in this work were obtained from the residue of tissue remaining after diagnostic samples had been taken, and which would otherwise have been discarded. The samples were snap-frozen in liquid nitrogen within ten minutes of arrival at the pathological specimen reception area and kept frozen in nitrogen until use. cDNA was synthesised with viral reverse transcriptase using 5 μg of total cellular RNA as template, followed by PCR with Primer P1 and Primer P4. PCR amplification, electrophoresis and hybridisation were performed under standard conditions.

When the PCR products were hybridised with radiolabelled E2 or E4, all samples from carcinomas over-expressed several splice variants, but the pattern of bands seen with each probe was different. Hence, the oligonucleotide probe for Exon 6 products is very effective in distinguishing neoplastic from non-neoplastic samples. It is not significantly more sensitive than E4, at least on samples from solid tissues, but is possibly useful for detecting the organ of origin of a disseminating metastatic cell or an established metastasis. Subsequently, the same filters were stripped and hybridised with P2 probe to show that all samples, including normal tissues, produced the standard portion of CD44. This confirmed that the differences observed between the results obtained with normal and tumour samples, probed with E2 and E4, were not due to unequal loading of PCR products. The cumulative results are summarised in Table 1 which indicates that these changes are seen in a wide range of common cancers.

TABLE 1
______________________________________________________________________
                            No. of Patients/   No. Showing Increased
Type of Tissue              Volunteers         Splice Variants
______________________________________________________________________
Neoplastic                  47                 46
  Breast Cancer             21                 21
  Colon Cancer              13                 13
  Bladder Cancer            6                  6
  Stomach Cancer            1                  1
  Thyroid Cancer            1                  1
  Fibroadenoma              2                  2
  Prostate Cancer           3                  2
Non-Neoplastic              39                 0
  Normal Breast             9                  0
  Cystic Disease of Breast  1                  0
  Normal Colon              9                  0
  Crohn's Disease           1                  0
  Ulcerative Colitis        1                  0
  Appendicitis              1                  0
  Normal Bladder            4                  0
  PBL                       10                 0
  Bone Marrow               3                  0
______________________________________________________________________
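
As a quick arithmetic check on Table 1, the per-tissue rows can be summed and compared with the "Neoplastic" and "Non-Neoplastic" total rows. The following is a minimal sketch; the dictionary names and the `totals` helper are illustrative assumptions and not part of the original disclosure, while the counts are transcribed from the table.

```python
# Counts transcribed from Table 1 as
# (number of patients/volunteers, number showing increased splice variants).
# The names NEOPLASTIC, NON_NEOPLASTIC and totals() are illustrative only.
NEOPLASTIC = {
    "Breast Cancer": (21, 21),
    "Colon Cancer": (13, 13),
    "Bladder Cancer": (6, 6),
    "Stomach Cancer": (1, 1),
    "Thyroid Cancer": (1, 1),
    "Fibroadenoma": (2, 2),
    "Prostate Cancer": (3, 2),
}
NON_NEOPLASTIC = {
    "Normal Breast": (9, 0),
    "Cystic Disease of Breast": (1, 0),
    "Normal Colon": (9, 0),
    "Crohn's Disease": (1, 0),
    "Ulcerative Colitis": (1, 0),
    "Appendicitis": (1, 0),
    "Normal Bladder": (4, 0),
    "PBL": (10, 0),
    "Bone Marrow": (3, 0),
}


def totals(rows):
    """Return (total subjects, total showing increased splice variants)."""
    subjects = sum(count for count, _ in rows.values())
    positive = sum(pos for _, pos in rows.values())
    return subjects, positive


print("Neoplastic:", totals(NEOPLASTIC))
print("Non-neoplastic:", totals(NON_NEOPLASTIC))
```

The sums agree with the total rows given in the table (47 and 46 for neoplastic samples, 39 and 0 for non-neoplastic samples).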

We have also examined some malignant tumours of bone and muscle and observed a similar pattern of marked over-expression of multiple spliced variants in the osteosarcoma.

EXAMPLE 5 Cancer Diagnosis by PCR Assay of Clinically-Harvested Urine Samples

Approximately 50 ml of naturally voided urine was obtained from each person and transported to the laboratory as speedily as possible. Specimens from 90 patients were examined: 44 from patients with biopsy-proven bladder cancer, and 46 from patients with non-neoplastic inflammation of the bladder (cystitis) and from normal volunteers. One ml of each urine sample was removed after thorough mixing and submitted for cytological examination. Another 1 ml of urine was checked by fluorescein diacetate-ethidium bromide staining to assess the viability of cells in the sample. The remainder of the urine was centrifuged at 2000 rpm for 10 minutes and the cell pellet was kept at -70° C. until use. mRNA extraction was performed with oligo dT cellulose tablets (Invitrogen). cDNA was synthesised with AMV reverse transcriptase (Invitrogen). The completed cDNA solution was divided equally into two tubes: one for PCR with E1 and E5, to amplify the particular cDNA transcript which we have found to be of diagnostic value, and the other for PCR with P1 and P4, to amplify the standard form of CD44, with or without all splice variants, as an internal control.

Thirty-five cycles of PCR were then carried out. The cycle conditions were: 95° C. for 1 minute, 55° C. for 1 minute, 72° C. for 2 minutes. A hot start procedure was adopted for all samples. Results are shown in FIG. 8.
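
The cycling programme stated above can be written down as data to estimate the programmed hold time. This is only a sketch: the step labels ("denature", "anneal", "extend") are conventional names assumed here rather than taken from the text, and the estimate ignores the hot start and the time the instrument spends ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str       # conventional label, assumed for illustration
    temp_c: int     # hold temperature in degrees Celsius (from the text)
    minutes: float  # hold time in minutes (from the text)


# One cycle as described: 95 C for 1 min, 55 C for 1 min, 72 C for 2 min.
CYCLE = (
    Step("denature", 95, 1),
    Step("anneal", 55, 1),
    Step("extend", 72, 2),
)
N_CYCLES = 35


def programmed_minutes(cycle, n_cycles):
    """Total hold time in minutes, excluding hot start and ramping."""
    return n_cycles * sum(step.minutes for step in cycle)


print(programmed_minutes(CYCLE, N_CYCLES), "minutes of programmed hold time")
```

On these assumptions the amplification step alone occupies a little over two hours.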

Equal volumes of PCR products were loaded in each lane of a 1.2% agarose gel and stained with ethidium bromide. If the cells in the urine were expressing all the exons from exon 6 to exon 14, it was predicted that the current PCR protocol, using primers E1 and E4, should produce a 735 bp band. There is no band in tracks containing cDNA from normal urine or from that of patients with non-neoplastic cystitis (lanes 1-8), but a clear 735 bp band is seen in all urine samples from patients with bladder cancer (lanes 9-16) when PCR was performed with primers E1 and E5 (upper panel).

A 482 bp band representing the standard form of CD44 was obtained almost equally in all cases when PCR was performed with P1 and P4 (lower panel). This indicates that the diagnostically significant differences between urine from patients with bladder cancer and that from controls were not caused by unequal loading of the tracks but by alternative splicing of the CD44 gene. Lanes 1-4: normal urine. Lanes 5-8: cystitis urine. Lanes 9-16: from patients 1-8 with bladder cancer.
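
The way the two PCR reactions are read together can be sketched as a small decision function. The 735 bp variant band and the 482 bp control band come from the text; the rule that a lane lacking the control band is treated as uninterpretable is an assumption added here, following from the stated role of the P1/P4 reaction as an internal control. The names `Call` and `interpret_lane` are hypothetical.

```python
from enum import Enum


class Call(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    INVALID = "invalid"


def interpret_lane(variant_band_735: bool, control_band_482: bool) -> Call:
    """Combine the E1/E5 (variant) and P1/P4 (control) results for one sample.

    Assumption: without the 482 bp control band the sample cannot be read,
    because a missing variant band might then reflect poor mRNA or failed
    amplification rather than absence of the variant transcript.
    """
    if not control_band_482:
        return Call.INVALID
    return Call.POSITIVE if variant_band_735 else Call.NEGATIVE


# Example: a control urine (no 735 bp band) and a bladder cancer urine.
print(interpret_lane(False, True))
print(interpret_lane(True, True))
```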

In the overall results this 735 bp band was completely absent in 7 of 7 normal and 9 of 9 cystitis-affected urine specimens; that is, 0% false positives. Also, 14 of 19 (74%) urine samples from patients with bladder cancer showed a positive result (i.e. 26% false negatives). In the false negative samples there was a shortage of viable cancer cells, as indicated by fluorescein diacetate-ethidium bromide staining.
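
The percentages quoted above follow directly from the counts. A minimal sketch of the arithmetic is given below; the variable names are illustrative, and the terms "sensitivity" and "specificity" are used in their usual sense rather than being taken from this paragraph.

```python
# Counts taken from the paragraph above.
cancer_positive, cancer_total = 14, 19      # bladder cancer urines with the 735 bp band
control_negative, control_total = 7 + 9, 7 + 9  # normal + cystitis urines without the band

sensitivity = cancer_positive / cancer_total
specificity = control_negative / control_total
false_negative_rate = 1 - sensitivity
false_positive_rate = 1 - specificity

print(f"sensitivity         {sensitivity:.0%}")
print(f"false negative rate {false_negative_rate:.0%}")
print(f"specificity         {specificity:.0%}")
print(f"false positive rate {false_positive_rate:.0%}")
```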

EXAMPLE 6

Stools from 12 patients were assayed by the techniques described herein. Of the samples from 9 patients with colorectal carcinoma, 5 gave positive results. Of the samples from 3 normal patients, all 3 gave negative results. These figures, obtained from samples full of bacteria which were not subjected to any pretreatment, encourage the belief that a viable diagnostic assay could be developed without difficulty.

In the inventors' further experience of detecting tumour cells with this method, the following observations would be useful to others investigating its diagnostic potential. The major considerations to be aware of are that the reliability and reproducibility of the results depend critically on the quality of the mRNA obtained from the sample and upon the care with which the techniques are performed. The main requirement is to eliminate false negative results by ensuring that high quality mRNA is routinely obtained and by using internal standards in every reaction to monitor the PCR amplification step. False positives, provided they are not too frequent, are not a serious problem, because they can be recognised by replicate assays on the same or further samples and by reference to other clinical data.

The inventors have explored the procedures needed to ensure the routine RT-PCR detection of abnormal CD44 gene activity in small clinical samples containing tumour cells. If a tissue sample is divided into aliquots, half of which are frozen in liquid nitrogen immediately and the remainder of which are left at ambient temperature, one can show how the ability to detect CD44 splice variants declines with time and with mode of specimen handling. Fresh samples submitted to mRNA extraction within half an hour of excision give the most reliable results, and there is a gradual decline in quality over the next few hours if the fresh tissue is left at ambient temperature. If the sample is first snap frozen, the results obtained when RNA is extracted immediately after thawing are satisfactory, but they decline very rapidly if extraction is delayed, beginning within 15 minutes, the larger variant transcripts being lost first and ultimately even the standard form. It is also found that if snap-frozen cell and tissue samples are stored at -70° C. the results decline after 4 weeks, even if the mRNA is extracted immediately after thawing. It would seem therefore that degradation of RNA by ribonucleases released from cells ruptured during freezing continues, even at this temperature, although at slower rates. Further, as one would expect, if the sample taken for RNA extraction is from an area of necrosis or of fibrosis, one does not obtain the typical results seen with viable tumour tissue. Hence, care in sample selection and in specimen processing are both needed for generating reliable data.

Arising out of this, we prefer that a fresh sample should be held for not more than 24 hrs before being either frozen or treated to extract mRNA; and that a thawed sample should be held for not more than 2 hrs before being treated to extract mRNA.
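
The preferred holding times stated above lend themselves to a simple check in a sample-tracking script. The sketch below is illustrative only: the function name, the state labels "fresh" and "thawed", and the idea of recording a holding time per sample are assumptions; only the 24 hr and 2 hr limits come from the text.

```python
from datetime import timedelta

# Preferred maximum holding times from the text.
MAX_FRESH_HOLD = timedelta(hours=24)   # fresh sample, before freezing or mRNA extraction
MAX_THAWED_HOLD = timedelta(hours=2)   # thawed sample, before mRNA extraction

_LIMITS = {"fresh": MAX_FRESH_HOLD, "thawed": MAX_THAWED_HOLD}


def within_preferred_window(state: str, held_for: timedelta) -> bool:
    """Return True if a sample in the given state is still within the preferred limit.

    `state` must be "fresh" or "thawed" (hypothetical labels used for illustration).
    """
    if state not in _LIMITS:
        raise ValueError(f"unknown sample state: {state!r}")
    return held_for <= _LIMITS[state]


print(within_preferred_window("fresh", timedelta(hours=6)))
print(within_preferred_window("thawed", timedelta(hours=3)))
```

Such a check only flags samples outside the preferred limits; as described earlier in this section, results are most reliable when extraction follows excision or thawing as closely as possible.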

The diagnosis method described herein can be performed in a single day, possibly in a few hours, and is capable of being automated. Use of the method has been demonstrated on various tissue samples to detect a whole variety of cancers, and also on blood and urine samples. We therefore offer it as a convenient practical method for cancer screening and diagnosis. In principle it could also have wide general applicability to cancer detection and prevention programmes and therefore have epidemiologic and public health value. Proper application of its sensitivity, specificity and simplicity should add not only to initial cancer diagnosis but to evaluation of extent of disease in the body, to judgment of the efficacy of treatment and to early detection of tumour recurrences.

Figure Legends

Notation: N=normal, T=primary tumour, M=metastasis.

FIGS. 1A-1C

Autoradiogram of PCR products from breast tissue samples probed with E4 (10 hours exposure of X-ray film to sample filter). Panel A: malignant primary breast carcinomas with their metastases. Tracks 1, 2 and 3: patient B1; tracks 4, 5 and 6: patient B2; tracks 7, 8 and 9: patient B3; tracks 10 and 11: patient B4; tracks 12 and 13: patient B5. It can be seen that, compared to normal breast tissue, primary breast carcinomas and their metastatic deposits overexpress several splice variants. Note the doublet (arrows) at 1500 bp and 1650 bp, best seen in track 5. This is present in all tumours and metastases but is fogged in the other tracks at this time of exposure. It is not detectable in any normal samples even at much longer exposure times (23 hours). Panel B: breast carcinomas with no clinical evidence of metastasis. Tracks 14-20 are from patients B15-B21. The tumours all overexpress several variants, but show fewer bands and lower signal intensity, except track 16 (patient B17); see text. The 1500/1650 bp doublet (arrow) is easily recognisable in tracks 15, 16 and 18 at this length of exposure and became detectable in all other tumour-containing tracks on longer exposure. The illustration, however, shows only the shorter exposure, to avoid fogging the tracks which have stronger signals. Panel C: fibroadenomas (FA) and fibrocystic disease of the breast (Cyst). Tracks 21 and 22, containing the benign tumour samples (samples B22 and 23), express more than the non-neoplastic sample (fibrocystic disease) in track 23 (sample B24).

FIGS. 2A-2C

Autoradiogram of PCR products from breast tissue samples probed with probe P2 (1.5 hours exposure of X-ray film to sample filter). This result was obtained by reprobing the same filter as that used in FIG. 1, after stripping off the previous probe. Here it can be seen that i) the differences observed in FIG. 1 are not due to unequal loading of tracks, ii) the expression of the standard form of the molecule is quantitatively greater than that of any of the variants, iii) the standard form is expressed in all tissues examined and iv) further variants, which do not contain exon 3 transcripts, are also present and over-expressed in tumours. The 1500/1650 bp doublet can be recognised in the tumours in panel A but needed longer exposure to be detectable in panels B and C.

FIG. 3

Autoradiogram of PCR products from colon tissue samples probed with E4 (10 hours exposure of photographic film to sample filter). Tracks 1, 2 and 3: patient C1; tracks 4, 5 and 6: patient C2; tracks 7, 8 and 9: patient C3; tracks 10 and 11: patient C4; tracks 12 and 13: patient C5; track 14: normal liver sample. The picture shows the same features as described in the legend to FIG. 1 and that the findings apply to carcinomas of the colon. The 1500/1650 bp doublet (arrow) is easily recognisable in several tumour tracks (2 and 8-12) and the faint signal in the corresponding position in tracks 3, 5, 6 and 13 became stronger on longer exposure. However, none appeared in this vicinity in tracks 1, 4, 7 or 14 (normal tissue).

FIG. 4

Autoradiogram of PCR products from colon tissue samples probed with P2 (1.5 hours exposure of photographic film to sample filter). This confirms equal loading of the tracks and that the other points illustrated in FIG. 2 apply to colon carcinomas. Note that normal liver expresses the standard form of CD44.

FIGS. 5A-5B

Autoradiogram of PCR products of normal peripheral blood leukocytes, PBL (from 3 different persons), and other normal tissues probed with E4 (panel A; 8 hours exposure to photographic film) and P2 (panel B; 5 hours exposure to photographic film). Track 6 contains PCR products from a breast cancer (patient B1) as a positive control. With this combination of primers and probes, leukocytes can be seen to express the standard form of the CD44 molecule, but no detectable splice variants. The samples in tracks 4 and 5 were from individuals with no clinical evidence of neoplasia, as follows: track 4, breast tissue obtained at autopsy from the body of a woman who died of bacterial endocarditis; track 5, colon resected for volvulus.

TABLE 2
_____________________________________________________________________________________________________
PATIENT  AGE  DISEASE         TUMOUR SIZE  METASTASIS              HISTOLOGY (GRADE)   CLINICAL STAGE
_____________________________________________________________________________________________________
B1       56   Breast ca       2.5 cm       Lymph node
B2       53   Breast ca       3 cm         Lymph node
B3       65   Breast ca       3 cm         Lymph node
B4       54   Breast ca       5 cm         Lymph node (10/10)      IDC (mucinous) [1]
B5       59   Breast ca       5.5 cm       Lymph node
B6       59   Breast ca       3 cm         Lymph node
B7       61   Breast ca       4 cm         Lymph node (17/17)      ILC/IDC             3
B8       38   Breast ca       3.5 cm       Lymph node (1/5)        ILC                 2
B9       65   Breast ca       1.8 cm       Lymph node (5/6)        ILC                 2
B10      61   Breast ca                    Lymph node (10/13)      IDC [1]             2
B11      80   Breast ca       11 cm        Lymph node              3
B12      65   Breast ca       2.3 cm       Lymph node              ? 1
B13      68   Breast ca       2.8 cm       Lymph node (4/12)       IDC [3]             2
B14      47   Breast ca       7 cm         Lymph node                                  2
B15      38   Breast ca                    None (0/7)              IDC                 1
B16      62   Breast ca       3 cm         None (0/4)              IDC [3]             1
B17      62   Breast ca       3 cm         None (0/16)             IDC [2]             1
B18      63   Breast ca       3 cm         None [0/6]              1
B19      61   Breast ca       3 cm         None                    1
B20      42   Breast ca       4 cm         None                    IDC                 1
B21      65   Breast ca                    Lymph node              IDC/ILC
B22      54   Breast ca       6 cm         None (0/4)              IDC                 1
B23      49   Fibroadenoma    4 cm         --                      --                  --
B24      47   Fibroadenoma    3 cm         --                      --                  --
B25      29   Cystic disease  --           --                      --                  --
C1       72   Colon ca        5.0 cm       Lymph node              Well diff. adeno    3 [C]
C2            Colon ca                     Lymph node
C3       65   Colon ca        6.5 cm       Liver                   Mod diff. adeno     4 [D]
C4       56   Colon ca        7.8 cm       Lymph node (and liver)  Mod diff. adeno     4 [D]
C5            Colon ca                     Lymph node
C6       57   Colon ca        5 cm         Lymph node              Mod diff. adeno     3 [C]
C7            Colon ca                     None
C8       75   Colon ca        6.5 cm       Lymph node              Mod diff. adeno     3 [C]
C9       72   Colon ca        5.5 cm       Lymph node              Mod diff. adeno     3 [C]
C10      76   Colon ca        4.5 cm       None                    Well diff. adeno    1 [B]
T1            Thyroid ca
_____________________________________________________________________________________________________
Key:
IDC: infiltrating ductal carcinoma
ILC: infiltrating lobular carcinoma
Well diff. adeno: Well differentiated adenocarcinoma
Mod diff. adeno: Moderately differentiated adenocarcinoma
Letters in square brackets in Clinical Stage column refer to Dukes staging scheme for colon carcinoma

REFERENCES

1. Stamenkovic I, Amiot M, Pesando J. M, Seed B. A lymphocyte molecule implicated in lymph node homing is a member of the cartilage link protein family. Cell 1989; 56: 1057-1062.

2. Birch M, Mitchell S, Hart I. R. Isolation and characterisation of human melanoma cell variants expressing high and low levels of CD44. Cancer Res. 1991; 51: 6660-6667.

3. Gunthert U, Hofmann M, Rudy W, Reber S, Zoller M, Haussmann I, Matzku S, Wenzel A, Ponta H, Herrlich P. A new variant of glycoprotein CD44 confers metastatic potential to rat carcinoma cells. Cell 1991; 65: 13-24.

4. Sy M S, Guo Y-J, Stamenkovic I. Distinct effects of two CD44 isoforms on tumor growth in vivo. J. Exp. Med. 1991; 174: 859-866.

5. Hofmann M, Rudy W, Zoller M, Tolg C, Ponta H, Herrlich P, Gunthert U. CD44 splice variants confer metastatic behaviour in rats: Homologous sequences are expressed in human tumor cell lines. Cancer Res. 1991; 51: 5292-5297.

6. Stamenkovic I, Aruffo A, Amiot M, Seed B. The hematopoietic and epithelial forms of CD44 are distinct polypeptides with different adhesion potentials for hyaluronate-bearing cells. EMBO J. 1991; 10: 343-348.

7. Jackson D. G, Buckley J, Bell J. I. Multiple variants of the human lymphocyte homing receptor CD44 generated by insertions at a single site in the extracellular domain. J. Biol. Chem. 1992; 267: 4732-4739.

8. Chomczynski P, Sacchi N. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 1987; 162: 156.

9. Knudson A. G. Hereditary cancer, oncogenes and antioncogenes. Cancer Res. 1985; 45: 1437-43.

10. Tarin D. Tumour metastasis. In: Oxford Textbook of Pathology 1992; (eds: J. O'D. McGee, N. A. Wright, P. G. Isaacson). Oxford University Press, Oxford. pp 607-633.

11. Hayle A. J, Darling D. L, Taylor A. R, Tarin D. Transfection of metastatic capability with total genomic DNA from metastatic tumour cell lines. Differentiation, 1993, in press.

12. Screaton G. R., Bell M. V., Jackson D. G., Cornelis F. B., Gerth U., and Bell J. I. Genomic structure of DNA encoding the lymphocyte homing receptor CD44 reveals at least 12 alternatively spliced exons. Proc. Natl. Acad. Sci. USA, Vol 89, pp 12160-4, December 1992, Immunology.

    __________________________________________________________________________    SEQUENCE LISTING                                                              (1) GENERAL INFORMATION:                                                      (iii) NUMBER OF SEQUENCES: 13                                                 (2) INFORMATION FOR SEQ ID NO: 1:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 141 base pairs                                                    (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: double                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA to mRNA                                              (ix) FEATURE:                                                                 (A) NAME/KEY: CDS                                                             (B) LOCATION: 7..135                                                          (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:                                      GCTACCACTTTGATGAGCACTAGTGCTACAGCAACTGAGACAGCAACC48                            ThrLeuMetSerThrSerAlaThrAlaThrGluThrAlaThr                                    1510                                                                          AAGAGGCAAGAAACCTGGGATTGGTTTTCATGGTTGTTTCTACCATCA96                            LysArgGlnGluThrTrpAspTrpPheSerTrpLeuPheLeuProSer                              15202530                                                                      GAGTCAAAGAATCATCTTCACACAACAACACAAATGGCTGGTACG141                              GluSerLysAsnHisLeuHisThrThrThrGlnMetAla                                       3540                                                                          (2) INFORMATION FOR SEQ ID NO: 2:                                             (i) SEQUENCE CHARACTERISTICS:                 
                                (A) LENGTH: 43 amino acids                                                    (B) TYPE: amino acid                                                          (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: protein                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:                                      ThrLeuMetSerThrSerAlaThrAlaThrGluThrAlaThrLysArg                              151015                                                                        GlnGluThrTrpAspTrpPheSerTrpLeuPheLeuProSerGluSer                              202530                                                                        LysAsnHisLeuHisThrThrThrGlnMetAla                                             3540                                                                          (2) INFORMATION FOR SEQ ID NO: 3:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 26 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:                                      GACACATATTGCTTCAATGCTTCAGC26                                                  (2) INFORMATION FOR SEQ ID NO: 4:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 29 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: 
linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:                                      GATGCCAAGATGATCAGCCATTCTGGAAT29                                               (2) INFORMATION FOR SEQ ID NO: 5:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 21 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:                                      TGAGATTGGGTTGAAGAAATC21                                                       (2) INFORMATION FOR SEQ ID NO: 6:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 30 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: DNA (genomic)                                             (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:                                      CCTGAAGAAGATTGTACATCAGTCACAGAC30                                              (2) INFORMATION FOR SEQ ID NO: 7:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 21 base pairs                                                     (B) TYPE: nucleic acid                                          
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TGGATCACCGACAGCACAGAC 21

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TTGATGAGCACTAGTGCTACAGCA 24

(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CATTTGTGTTGTTGTGTGAAGATG 24

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
AGCCCAGAGGACAGTTCCTGG 21

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCCTGCTTGATGACCTCGTCCCAT 24

(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GACAGACACCTCAGTTTTTCTGGA 24

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TTCCTTCGTGTGTGGGTAATGAGA 24
__________________________________________________________________________
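
As an illustrative aid that is not part of the original listing, the residue count printed after each of SEQ ID NO: 7 to 13 can be checked against the sequence itself. The sketch below is a minimal Python check; the dictionary layout and the function name check_listing are assumptions made only for this illustration, and the sequences and counts are transcribed from the listing above.

```python
# Sequences and trailing residue counts transcribed from SEQ ID NO: 7-13.
# The data layout and function name are illustrative assumptions only.
LISTING = {
    7: ("TGGATCACCGACAGCACAGAC", 21),
    8: ("TTGATGAGCACTAGTGCTACAGCA", 24),
    9: ("CATTTGTGTTGTTGTGTGAAGATG", 24),
    10: ("AGCCCAGAGGACAGTTCCTGG", 21),
    11: ("TCCTGCTTGATGACCTCGTCCCAT", 24),
    12: ("GACAGACACCTCAGTTTTTCTGGA", 24),
    13: ("TTCCTTCGTGTGTGGGTAATGAGA", 24),
}


def check_listing(listing):
    """Return the SEQ ID numbers whose sequence is malformed or miscounted."""
    problems = []
    for seq_id, (sequence, declared_length) in sorted(listing.items()):
        # A DNA entry should contain only the four standard bases.
        only_bases = set(sequence) <= set("ACGT")
        if not only_bases or len(sequence) != declared_length:
            problems.append(seq_id)
    return problems


if __name__ == "__main__":
    # An empty list means every entry matches its declared count.
    print(check_listing(LISTING))
```

An entry is reported only when it contains a character other than A, C, G or T, or when its length differs from the count given in the listing.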

We claim:
 1. A method of diagnosis of neoplasia, which method comprises analyzing CD44 gene expression in a sample by the steps of making complementary DNA (cDNA) from messenger RNA (mRNA) in the sample, amplifying portions of the cDNA corresponding to the CD44 gene, and detecting the amplified cDNA thereby diagnosing neoplasia.
 2. The method as claimed in claim 1 wherein the sample is from a tissue which may be a solid tumour or from blood or other body fluid.
 3. The method as claimed in claim 1 wherein the sample is non-invasively obtained.
 4. A method as claimed in claim 1 wherein a labelled specific oligonucleotide primer or probe is used in detection of the cDNA.
 5. The method as claimed in claim 1 wherein the amplification is carried out via polymerase chain reaction.
 6. The method as claimed in claim 1 wherein the amplified cDNA is size separated by electrophoresis prior to detection.
 7. The method as claimed in claim 6, wherein blotting and autoradiography are performed on the separated cDNA.
 8. In pure form, the exon of CD44 having the nucleic acid sequence shown in FIG. 7 (SEQ ID NO:1), and a polynucleotide that is fully complementary thereto.
 9. An oligonucleotide selected from the group consisting of 5'-TTGATGAGCACTAGTGCTACAGCA (SEQ ID NO:8) and 5'-CATTTGTGTTGTTGTGTGAAGATG (SEQ ID NO:9).
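
By way of illustration only, and forming no part of the claims, the notion of a polynucleotide that is fully complementary to a given strand can be sketched in Python. The example derives the reverse complement, written 5' to 3', of the two oligonucleotides recited in claim 9; the function name reverse_complement and the variable names are assumptions introduced for this sketch.

```python
# Oligonucleotides recited in claim 9, written 5' to 3'.
# Variable and function names are illustrative assumptions only.
SEQ_ID_NO_8 = "TTGATGAGCACTAGTGCTACAGCA"
SEQ_ID_NO_9 = "CATTTGTGTTGTTGTGTGAAGATG"

# Watson-Crick pairing: A with T, C with G.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the fully complementary strand, read 5' to 3'."""
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError("unexpected characters: %s" % sorted(unexpected))
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sequence.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, oligo in (("SEQ ID NO:8", SEQ_ID_NO_8),
                        ("SEQ ID NO:9", SEQ_ID_NO_9)):
        print(name, "5'-" + oligo, "complement 5'-" + reverse_complement(oligo))
```

Applying the function twice returns the original sequence, which is a convenient check that the complementary strand has the same length and pairs base for base with the recited oligonucleotide.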